Lexicalized Stochastic Modeling of Constraint-Based Grammars using Log-Linear Measures and EM Training

Authors

  • Stefan Riezler
  • Detlef Prescher
  • Jonas Kuhn
  • Mark Johnson
Abstract

We present a new approach to stochastic modeling of constraint-based grammars that is based on log-linear models and uses EM for estimation from unannotated data. The techniques are applied to an LFG grammar for German. Evaluation on an exact match task yields 86% precision for an ambiguity rate of 5.4, and 90% precision on a subcat frame match for an ambiguity rate of 25. Experimental comparison to training from a parsebank shows a 10% gain from EM training. Also, a new class-based grammar lexicalization is presented, showing a 10% gain over unlexicalized models.
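
For orientation, the general textbook form of a log-linear model over analyses, and of EM estimation from unannotated sentences, can be sketched as below. This is a generic formulation and not necessarily the paper's exact notation: the symbols x (an analysis), y (an unannotated sentence), X(y) (the analyses the grammar assigns to y), f_i (feature functions), \lambda_i (weights) and \tilde{p} (the empirical distribution over sentences) are all assumptions introduced here for illustration.

```latex
% Log-linear distribution over analyses x (assumed notation):
p_\lambda(x) = \frac{1}{Z_\lambda} \exp\Big( \sum_i \lambda_i f_i(x) \Big),
\qquad
Z_\lambda = \sum_{x'} \exp\Big( \sum_i \lambda_i f_i(x') \Big)

% Incomplete-data log-likelihood for unannotated sentences y:
L(\lambda) = \sum_y \tilde{p}(y) \, \log \sum_{x \in X(y)} p_\lambda(x)

% E-step: posterior over the analyses of y under the current weights \lambda':
p_{\lambda'}(x \mid y) = \frac{p_{\lambda'}(x)}{\sum_{x' \in X(y)} p_{\lambda'}(x')}

% M-step: choose new weights maximizing the expected complete-data log-likelihood:
\lambda \leftarrow \arg\max_{\lambda} \sum_y \tilde{p}(y) \sum_{x \in X(y)} p_{\lambda'}(x \mid y) \, \log p_\lambda(x)
```

Under this reading, disambiguation amounts to picking, for each sentence y, the analysis in X(y) with the highest probability under the trained weights.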

Similar resources

Lexicalized Stochastic Modeling of Constraint-Based Grammars using Log-Linear Measures

We present a new approach to stochastic modeling of constraint-based grammars that is based on log-linear models and uses EM for estimation from unannotated data. The techniques are applied to an LFG grammar for German. Evaluation on an exact match task yields 86% precision for an ambiguity rate of 5.4, and 90% precision on a subcat frame match for an ambiguity rate of 25. Experimental comparis...

An Empirical Evaluation of Probabilistic Lexicalized Tree Insertion Grammars

We present an empirical study of the applicability of Probabilistic Lexicalized Tree Insertion Grammars (PLTIG), a lexicalized counterpart to Probabilistic Context-Free Grammars (PCFG), to problems in stochastic natural-language processing. Comparing the performance of PLTIGs with non-hierarchical N-gram models and PCFGs, we show that PLTIG combines the best aspects of both, with language modeli...

Estimators for Stochastic "Unification-Based" Grammars

Log-linear models provide a statistically sound framework for Stochastic “Unification-Based” Grammars (SUBGs) and stochastic versions of other kinds of grammars. We describe two computationally tractable ways of estimating the parameters of such grammars from a training corpus of syntactic analyses, and apply these to estimate a stochastic version of Lexical-Functional Grammar.

Feature Selection for a Rich HPSG Grammar Using Decision Trees

This paper examines feature selection for log-linear models over rich constraint-based grammar (HPSG) representations by building decision trees over features in corresponding probabilistic context-free grammars (PCFGs). We show that single decision trees do not make optimal use of the available information; constructed ensembles of decision trees based on different feature subspaces show signi...

Journal title:
  • CoRR

Volume: cs.CL/0008034

Publication date: 2000